Support for metadata columns (location, size, last_modified) in ListingTableProvider #74

Open
wants to merge 8 commits into
base: spiceai-43

@phillipleblanc phillipleblanc commented Feb 26, 2025

Which issue does this PR close?

TBD

Rationale for this change

This adds another way to prune files that don't need to be read, similar to partition pruning, but based on the metadata of the file itself. For example, it can be used to efficiently find all data in files that have changed since the last query: SELECT * FROM test WHERE last_modified > {last_check_time}.
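As a rough illustration of the pruning idea (not the code in this PR), the sketch below filters a file listing by last_modified before any file is opened. The `FileMeta` struct and the `prune_by_last_modified` function are hypothetical stand-ins invented for this example, and representing the timestamp as seconds since the Unix epoch is an assumption.

```rust
/// Hypothetical stand-in for the per-file metadata an object store reports.
/// It is not the real `ObjectMeta` type; the fields only mirror the three
/// properties named in this PR.
#[derive(Debug, Clone, PartialEq)]
struct FileMeta {
    location: String,
    size: u64,
    /// Seconds since the Unix epoch (an assumption made for this sketch).
    last_modified: i64,
}

/// Keep only the files modified strictly after `last_check_time`,
/// mirroring the predicate `last_modified > {last_check_time}`.
/// Files that fail the predicate never need to be read.
fn prune_by_last_modified(files: &[FileMeta], last_check_time: i64) -> Vec<&FileMeta> {
    files
        .iter()
        .filter(|f| f.last_modified > last_check_time)
        .collect()
}
```

Calling `prune_by_last_modified(&files, last_check_time)` on a listing returns only the entries that could contain matching rows, which is the same effect partition pruning has for partition columns.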

What changes are included in this PR?

Adds a new option to ListingOptions for specifying metadata columns. The supported metadata properties are location, size, and last_modified. When any of these properties are included, they are added to the table schema (similar to partition columns), and their values are filled in from the ObjectMeta of each file.
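The mechanism described above can be sketched roughly as follows. This is a simplified model, not the PR's implementation: `MetadataColumn`, `Value`, `FileMeta`, `table_schema`, and `metadata_values` are all hypothetical names, the schema is reduced to a list of column names, and appending the metadata columns after the file columns is an assumption based on the comparison with partition columns.

```rust
/// Hypothetical enum naming the three supported metadata properties.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MetadataColumn {
    Location,
    Size,
    LastModified,
}

impl MetadataColumn {
    /// Column name as it would appear in the table schema.
    fn name(self) -> &'static str {
        match self {
            MetadataColumn::Location => "location",
            MetadataColumn::Size => "size",
            MetadataColumn::LastModified => "last_modified",
        }
    }
}

/// Hypothetical stand-in for the object store's per-file metadata.
struct FileMeta {
    location: String,
    size: u64,
    /// Seconds since the Unix epoch (an assumption made for this sketch).
    last_modified: i64,
}

/// Simplified cell value; a real implementation would use Arrow types.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Text(String),
    UInt(u64),
    Timestamp(i64),
}

/// Append the requested metadata columns to the file's own columns.
fn table_schema(file_columns: &[&str], metadata_columns: &[MetadataColumn]) -> Vec<String> {
    file_columns
        .iter()
        .map(|c| c.to_string())
        .chain(metadata_columns.iter().map(|m| m.name().to_string()))
        .collect()
}

/// Produce the per-file constant values for the requested metadata columns.
fn metadata_values(meta: &FileMeta, metadata_columns: &[MetadataColumn]) -> Vec<Value> {
    metadata_columns
        .iter()
        .map(|m| match m {
            MetadataColumn::Location => Value::Text(meta.location.clone()),
            MetadataColumn::Size => Value::UInt(meta.size),
            MetadataColumn::LastModified => Value::Timestamp(meta.last_modified),
        })
        .collect()
}
```

Because every row read from a given file shares the same metadata values, they can be computed once per file and attached to each batch, much as partition values are.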

Are these changes tested?

Yes, added tests to path_partition.rs.

Are there any user-facing changes?

@phillipleblanc phillipleblanc self-assigned this Feb 26, 2025